26 research outputs found

    Set up your own bioinformatics server: Chipster in EGI Federated Cloud

    Chipster is an easy-to-use data analysis platform for bioinformatics. It provides a uniform graphical interface to over 360 commonly used bioinformatics tools, including several R/Bioconductor-based tools and standalone programs (e.g., BWA, TopHat). Chipster is based on a client-server architecture in which the user runs a Chipster client locally that submits analysis tasks to a Chipster server. Even though Chipster is an open source tool, there is no public Chipster server open to everybody. A researcher therefore needs access to one of the existing Chipster servers in order to use the platform; alternatively, a researcher can set up their own Chipster server. In this paper, we describe how a Chipster server can be launched in the EGI Federated Cloud environment, which provides resources for all European researchers. With the instructions provided here, any European researcher can launch and manage their own Chipster server, suited to the needs of a small research group or a bioinformatics course. The setup described here is based on a collaboration between several European institutions. Chipster is developed by CSC – IT Center for Science Ltd. in Finland. The European Grid Infrastructure (EGI) has adapted Chipster to the cloud environment and provides the cloud computing resources. Finally, Rutherford Appleton Laboratory hosts the CVMFS server that provides the scientific tools and data sets for the Chipster servers running in the EGI Federated Cloud.
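    As a rough illustration of the kind of launch step the paper's instructions automate, the sketch below boots a server image on an OpenStack-based cloud such as the EGI Federated Cloud using the openstacksdk library. The cloud name, image, flavor, and network names are placeholders for illustration, not values from the paper.

```python
import openstack

# Connect using an entry from clouds.yaml; the cloud name is a placeholder.
conn = openstack.connect(cloud="egi-fedcloud")

# Image and flavor names are hypothetical; a real deployment would use
# the Chipster image published for the EGI Federated Cloud.
image = conn.compute.find_image("chipster-server")
flavor = conn.compute.find_flavor("m1.large")
network = conn.network.find_network("public")

server = conn.compute.create_server(
    name="my-chipster-server",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
)
conn.compute.wait_for_server(server)
print("Chipster server running at", server.access_ipv4)
```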

    Antitumor effects of table grape extracts

    Grape (Vitis vinifera L.) is a fruit rich in polyphenols, bioactive compounds able to prevent cancer, reduce tumorigenesis, and influence critical cancer-related pathways. This research summarizes the main results obtained in our previous works: 1) the characterization of the polyphenolic content and antioxidant activity of two table grape skin extracts (GSEs), Autumn Royal and Egnatia; 2) the effects of the GSEs on Caco2 colon cancer cells; 3) the effects of the GSEs on the lipid composition and fluidity of the cell membrane. These in vitro studies suggested that Autumn Royal and Egnatia contain high levels of polyphenols, possess antiproliferative activity against the Caco2 human colon carcinoma cell line, and inhibit cell migration by acting on membrane fatty acid composition. Moreover, these results highlighted that the new grape variety Egnatia is a promising source of phenolic compounds that could be of interest to the food and pharmaceutical industries.

    A Cloud-Edge Orchestration Platform for the Innovative Industrial Scenarios of the IoTwins Project

    The concept of digital twins has attracted growing interest not only in academia but also in industry, thanks to the fact that the Internet of Things has enabled its cost-effective implementation. A digital twin (or digital model) is a virtual representation of a physical product or process that integrates data from various sources, such as data APIs, historical data, embedded sensors, and open data, giving manufacturers an unprecedented view of how their products are performing. The EU-funded IoTwins project plans to build testbeds for digital twins that run real-time computation as close to the data origin as possible (e.g., on IoT gateways or edge nodes), while batch-wise tasks such as Big Data analytics and Machine Learning model training run on the Cloud, where computing resources are abundant. In this paper, the basic concepts of the IoTwins project and its reference architecture, functionalities, and components are presented and discussed.
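    A minimal sketch of the cloud-edge placement rule the abstract describes: latency-sensitive work stays near the data origin, batch work goes to the cloud. The task attributes and tier names below are illustrative, not part of the IoTwins reference architecture.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    latency_sensitive: bool  # must run near the data origin?
    batch: bool              # large, throughput-bound job?

def place(task: Task) -> str:
    """Return the tier a task should run on, per the cloud-edge split."""
    if task.latency_sensitive:
        return "edge"   # e.g., IoT gateway or edge node
    return "cloud"      # e.g., Big Data analytics, ML model training

print(place(Task("vibration-anomaly-detection", True, False)))  # -> edge
print(place(Task("ml-model-training", False, True)))            # -> cloud
```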

    Gene expression signature induced by grape intake in healthy subjects reveals wide-spread beneficial effects on peripheral blood mononuclear cells

    Using a transcriptomic approach, we performed a pilot study in healthy subjects to evaluate the changes in gene expression induced by grape consumption. Blood from twenty subjects was collected at baseline (T0), after 21 days of a grape-rich diet (T1), and after a one-month washout (T2). Gene expression profiling of peripheral blood mononuclear cells from six subjects identified 930 differentially expressed transcripts. Gene functional analysis revealed changes (at T1 and/or T2) suggestive of antithrombotic and anti-inflammatory effects, confirming and extending previous findings on the same subjects. Moreover, we observed several other favourable changes in the transcription of genes involved in crucial processes such as immune response, DNA and protein repair, autophagy, and mitochondrial biogenesis. Finally, we detected significant changes in many long non-coding RNA genes, whose regulatory functions are being increasingly appreciated. Altogether, our data suggest that a grape diet may exert its beneficial effects by targeting different strategic pathways.
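    The abstract does not name the statistical pipeline used, so the sketch below only illustrates the general shape of such an analysis: a paired test per transcript between two timepoints, followed by multiple-testing correction. The matrix shapes, random data, and the 0.05 FDR cutoff are assumptions for illustration.

```python
import numpy as np
from scipy.stats import ttest_rel
from statsmodels.stats.multitest import multipletests

# expr_t0, expr_t1: (transcripts x subjects) expression matrices with
# columns paired by subject; random data stands in for real profiles.
rng = np.random.default_rng(0)
expr_t0 = rng.normal(size=(1000, 6))
expr_t1 = expr_t0 + rng.normal(scale=0.5, size=(1000, 6))

# Paired t-test per transcript across subjects (T0 vs T1).
_, pvals = ttest_rel(expr_t1, expr_t0, axis=1)

# Benjamini-Hochberg FDR correction; the alpha is an assumption.
rejected, _, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
print("differentially expressed transcripts:", int(rejected.sum()))
```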

    Analysis of high-identity segmental duplications in the grapevine genome

    Background: Segmental duplications (SDs) are blocks of genomic sequence of 1-200 kb that map to different loci in a genome and share a sequence identity > 90%. At the sequence level, SDs show the same characteristics as other regions of the human genome: they contain both high-copy repeats and gene sequences. SDs play an important role in genome plasticity by creating new genes and modeling genome structure. Although data is plentiful for mammals, not much was known about the representation of SDs in plant genomes. In this regard, we performed a genome-wide analysis of high-identity SDs on the sequenced grapevine (Vitis vinifera) genome (PN40024).
    Results: We demonstrate that recent SDs (> 94% identity and ≥ 10 kb in size) are a relevant component of the grapevine genome (85 Mb, 17% of the genome sequence). We detected mitochondrial and plastid DNA and genes (10% of gene annotation) in segmentally duplicated regions of the nuclear genome. In particular, the nine highest copy number genes have a copy in either or both organelle genomes. Further, we showed that several duplicated genes take part in the biosynthesis of compounds involved in plant response to environmental stress.
    Conclusions: These data show the great influence of SDs and organelle DNA transfers in modeling the Vitis vinifera nuclear DNA structure, as well as the impact of SDs in contributing to the adaptive capacity of grapevine and the nutritional content of grape products through genome variation. This study represents a step forward in the full characterization of duplicated genes important for grapevine cultural needs and human health.
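    A toy sketch of the filtering criteria the abstract states for recent SDs (identity above 94%, length of at least 10 kb, distinct loci). The alignment-record format is an assumption; the paper's actual detection pipeline is not described in the abstract.

```python
from typing import NamedTuple

class Alignment(NamedTuple):
    chrom_a: str
    start_a: int
    chrom_b: str
    start_b: int
    length: int      # aligned block length in bp
    identity: float  # fraction of identical bases

def is_recent_sd(aln: Alignment) -> bool:
    """Apply the abstract's thresholds for recent segmental duplications."""
    distinct_loci = (aln.chrom_a, aln.start_a) != (aln.chrom_b, aln.start_b)
    return distinct_loci and aln.identity > 0.94 and aln.length >= 10_000

hits = [
    Alignment("chr1", 100_000, "chr7", 2_000_000, 15_000, 0.97),
    Alignment("chr3", 5_000, "chr3", 5_000, 20_000, 1.00),  # self-match
]
print([h.chrom_a for h in hits if is_recent_sd(h)])  # -> ['chr1']
```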

    INDIGO-DataCloud: A data and computing platform to facilitate seamless access to e-infrastructures

    This paper describes the achievements of the H2020 project INDIGO-DataCloud. The project has provided e-infrastructures with tools, applications, and cloud framework enhancements to manage the demanding requirements of scientific communities, either locally or through enhanced interfaces. The middleware developed allows hybrid resources to be federated and scientific applications to be easily written, ported, and run on the cloud. In particular, we have extended existing PaaS (Platform as a Service) solutions, allowing public and private e-infrastructures, including those provided by EGI, EUDAT, and Helix Nebula, to integrate their existing services and make them available through AAI services compliant with GEANT interfederation policies, thus guaranteeing transparency and trust in the provisioning of such services. Our middleware facilitates the execution of applications using containers on Cloud and Grid based infrastructures, as well as on HPC clusters. Our developments are freely downloadable as open source components and are already being integrated into many scientific applications.
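    INDIGO's PaaS layer is driven by TOSCA templates submitted to an orchestrator over REST. The sketch below shows that interaction pattern in outline only; the endpoint URL, token handling, and payload schema are assumptions for illustration, not the documented INDIGO API.

```python
import requests

# Placeholder endpoint and token; a real deployment would use the site's
# PaaS Orchestrator URL and an OIDC access token obtained from the AAI.
ORCHESTRATOR = "https://orchestrator.example.org/deployments"
TOKEN = "..."  # elided

tosca_template = """
tosca_definitions_version: tosca_simple_yaml_1_0
topology_template:
  node_templates:
    my_container:
      type: tosca.nodes.Container.Application.Docker
"""

resp = requests.post(
    ORCHESTRATOR,
    json={"template": tosca_template},
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=30,
)
resp.raise_for_status()
print("deployment submitted:", resp.json())
```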

    Integration of Apache Mesos over GPUs resources in the DEEP Hybrid DataCloud project

    Work presented at IBERGRID: Delivering Innovative Computing and Data Services to Researchers, held in Santiago de Compostela (Spain), 23-26 September 2019. The DEEP Hybrid DataCloud project was proposed out of the need to support a range of compute-intensive techniques on specialized hardware, such as HPC systems or GPUs. The project focuses on the integration of this specialized, and expensive, hardware under a cloud platform such as OpenStack, so that it can be used on demand by researchers from different areas. Within this project, a set of building blocks collectively called "DEEP as a Service" was implemented to make application deployment easier for the user. For this development, it is necessary to give researchers access to these technologies as friendly but powerful services able to exploit very large datasets. On the one hand we have the GPU resources, and on the other hand the users and their applications that run on those resources; DEEP needs to provide a service that controls how users can use those resources efficiently. Although there are multiple technologies, such as queuing systems, that address this problem, Apache Mesos does so in an effective and controlled manner. Apache Mesos is a technology that abstracts the resources of different machines, such as CPU, GPU, RAM, and storage, to provide a scheduling and distributed system across the whole cloud environment. Mesos is easy to deploy on CPU-based systems, but GPUs need a trickier configuration, as shown in this solution. As an added value, this solution provides an Apache2 configuration for authenticating users from different communities through the current AAI service in DEEP. The proposed presentation will show the design and deployment of Mesos to work with GPU resources within the scope of the DEEP Hybrid DataCloud project. This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 777435.
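    A schematic sketch of the resource-offer matching a Mesos framework performs, restricted to the GPU case the abstract discusses. The offer structure below is a simplified stand-in for Mesos's protobuf messages, not the actual Mesos or pymesos API.

```python
# Each offer advertises the free resources of one agent node.
offers = [
    {"agent": "node-1", "cpus": 16, "mem": 65536, "gpus": 0},
    {"agent": "node-2", "cpus": 8,  "mem": 32768, "gpus": 2},
]

def match_gpu_task(offers, cpus=4, mem=8192, gpus=1):
    """Pick the first offer that can host a GPU task; decline the rest."""
    for offer in offers:
        if (offer["cpus"] >= cpus and offer["mem"] >= mem
                and offer["gpus"] >= gpus):
            return offer["agent"]
    return None  # no agent fits right now; wait for new offers

print(match_gpu_task(offers))  # -> node-2
```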

    Workshop on Cloud Services for File Synchronisation and Sharing

    We have set up an OpenStack-based cloud infrastructure in the framework of a publicly funded project, PRISMA, aimed at the implementation of a fully integrated PaaS+IaaS platform to provide services in the field of smart government (e-health, e-government, etc.). The IaaS testbed currently consists of 18 compute nodes providing in total almost 600 cores, 3550 GB of RAM, and 400 TB of disk storage. Connectivity is ensured through two NICs, 1 Gbit/s and 10 Gbit/s. Both the backend (MySQL database and RabbitMQ message broker) and the core services (nova, keystone, glance, neutron, etc.) have been configured in high availability using HA clustering techniques. The full capacity, available by 2015, will provide 2000 cores and 8 TB of RAM.
    In this work we present the storage solutions that we are currently using as backends for our production cloud services. Storage is one of the key components of the cloud stack and is used both to host the running VMs ("ephemeral" storage) and to host persistent data such as the block devices used by the VMs or users' archived unstructured data, backups, virtual images, etc. Storage-as-a-service is implemented in OpenStack by the Block Storage project, Cinder, and the Object Storage project, Swift. Selecting the right software to manage the underlying backend storage for these services is very important, and decisions can depend on many factors, not only technical but also economic: in most cases they result from a trade-off between performance and costs.
    Many operators use separate compute and storage hosts. We decided not to follow this mainstream trend, aiming instead at the best cost-performance scenario: for us it makes sense to run compute and storage on the same machines, since we want to be able to dedicate as many of our hosts as possible to running instances. Therefore, each compute node is configured with a significant amount of disk space, and a distributed file system (GlusterFS and/or Ceph) ties the disks from each compute node into a single file system. In this case, the reliability and stability of the shared file system is critical and defines the effort needed to maintain the compute hosts: tests have been performed to assess the stability of the shared file systems while changing the replica factor. For example, we observed that GlusterFS with replica 2 cannot be used in production because it is highly unstable even at moderate storage sizes. Our experience can be useful for all those organizations that have specific constraints in the procurement of a compute cluster or need to deploy on pre-existing servers over whose specifications they have little or no control. Moreover, the solution we propose is flexible, since it is always possible to add external storage when additional capacity is required.
    We currently use the GlusterFS distributed file system for: storage of the running VMs, enabling live migration; storage of the virtual images (as the primary Glance image store); and the implementation of one of the Cinder backends for block devices. In particular, we have been using Cinder with the LVM-iSCSI driver since the Grizzly release, when the GlusterFS driver for Cinder did not support advanced features like snapshots and clones, fundamental for our use cases. In order to exploit the advantages of GlusterFS even with the LVM driver, we created the Cinder volume groups on GlusterFS loopback devices.
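    As a concrete illustration of that last step, here is a minimal sketch of provisioning an LVM volume group on a GlusterFS loopback device, driven from Python. The mount path, device name, and size are placeholders; the abstract does not detail the actual setup commands.

```python
import subprocess

def run(cmd):
    """Run a privileged command, raising on failure."""
    subprocess.run(cmd, check=True)

# Back the volume group with a sparse file on the GlusterFS mount
# (path and size are placeholders).
run(["truncate", "-s", "500G", "/mnt/glusterfs/cinder-volumes.img"])

# Attach the file as a loopback block device.
run(["losetup", "/dev/loop0", "/mnt/glusterfs/cinder-volumes.img"])

# Create the LVM volume group that the Cinder LVM driver expects.
run(["pvcreate", "/dev/loop0"])
run(["vgcreate", "cinder-volumes", "/dev/loop0"])
```

    With the volume group backed by GlusterFS, block volumes created through the LVM-iSCSI driver sit on the distributed file system and benefit from its replication.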
    Upgrading our infrastructure to Havana, we decided to enable Ceph as an additional backend for Cinder in order to compare the features, reliability, and performance of the two solutions. Our interest in Ceph also derives from the possibility of consolidating the infrastructure's overall backend storage into a unified solution. To this end, we are currently testing Ceph for running the virtual machines, using both the RBD and CephFS protocols, and for implementing the object storage.
    We test the scalability and performance of the deployed system using test cases derived from the typical pattern of storage utilization. The tools used for testing are standard software widely used for this purpose, such as iozone and/or dd for block storage, and specific benchmarking tools like COSBench, swift-bench, and ssbench for the object storage. Using different tools to test the file system, and comparing their results with observations of the real test case, is also a good opportunity to test the reliability of the benchmarking tools themselves. Throughput tests have been planned and conducted on the two system configurations in order to understand the performance of both storage solutions and their impact on applications, with the aim of achieving better SLAs and end-user experience.
    In implementing our cloud platform, we also focused on providing transparent access to data using standardized protocols (both de iure and de facto standards). In particular, Amazon-compatible S3 and CDMI (Cloud Data Management Interface) interfaces have been installed on top of the Swift Object Storage in order to promote interoperability also at the PaaS/SaaS levels.
    Data is important for businesses of all sizes. Therefore, one of the most common user requirements is the ability to back up data in order to minimize data loss, stay compliant, and preserve data integrity. Implementing this feature is particularly challenging when the users come from public administrations and scientific communities that produce huge quantities of heterogeneous data and/or have strict constraints. An interesting feature of the Swift Object Storage is the geographic replica, which can be used to add a disaster-recovery capability to the data and services exposed by our infrastructure. Ceph provides a similar feature: geo-replication through the RADOS gateway. We have therefore installed and configured both a Swift global cluster and a Ceph federated cluster, distributed across three different geographic sites. Results of the performance tests conducted on both clusters are presented, along with a description of the parameter tuning performed for optimization. The replication methods implemented in the two middleware stacks, Swift and Ceph, are compared in terms of network traffic bandwidth, CPU, and memory consumption.
    Another important aspect we are taking care of is QoS (Quality of Service) support, i.e., the capability of providing different levels of storage service optimized with respect to the user application profile. This can be achieved by defining different tiers of storage and setting parameters such as how many I/Os the storage can handle, what limit it should place on latency, what availability levels it should offer, and so on. Our final goal is also to set up a (semi-)automated system capable of self-optimising. We are therefore exploring the cache tiering feature of Ceph, which handles the migration of data between the cache tier and the backing storage tier automatically.
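    Once the S3-compatible interface is enabled on Swift, any standard S3 client can exercise it. Below is a hedged sketch using boto3; the endpoint URL, credentials, and bucket and file names are placeholders, not values from this deployment.

```python
import boto3

# Endpoint and credentials are placeholders for the S3 middleware
# deployed on top of Swift.
s3 = boto3.client(
    "s3",
    endpoint_url="https://swift.example.org:8080",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Store a backup object and confirm it is listed.
s3.create_bucket(Bucket="pa-backups")
s3.upload_file("archive.tar.gz", "pa-backups", "archive.tar.gz")
print(s3.list_objects_v2(Bucket="pa-backups")["KeyCount"])
```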
    Results of these testing activities will also be shown in this presentation.

    Developing a monitoring system for Cloud-based distributed data-centers

    Nowadays more and more datacenters cooperate with each other to achieve a common, more complex goal. New advanced functionalities are required to support experts during recovery and management activities, such as anomaly detection and fault pattern recognition. The proposed solution provides active support for problem solving by datacenter management teams by automatically providing the root cause of detected anomalies. The project has been developed in Bari, using the ReCaS datacenter as a testbed. Big Data solutions have been selected to properly handle the complexity and size of the data. Features like open source licensing, a large community, horizontal scalability, and high availability have been considered, and tools belonging to the Hadoop ecosystem have been selected. The collected information is sent to a combination of Apache Flume and Apache Kafka, used as the transport layer, which in turn delivers data to databases and processing components. Apache Spark has been selected as the analysis component. Different kinds of databases have been considered in order to satisfy multiple requirements: Hadoop Distributed File System, Neo4j, InfluxDB, and Elasticsearch. Grafana and Kibana are used to show data in dedicated dashboards. The root-cause analysis engine has been implemented using custom machine learning algorithms. Finally, results are forwarded to experts by email or Slack, using Riemann.
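    A minimal sketch of the transport-to-analysis hop described above: Spark Structured Streaming consuming the metric stream from Kafka. The broker address and topic name are assumptions, and the console sink stands in for the real anomaly-detection logic and the writes to Elasticsearch/InfluxDB, which the abstract does not detail.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("dc-monitoring").getOrCreate()

# Subscribe to the metrics topic that Flume/Kafka deliver
# (broker and topic names are placeholders).
metrics = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "kafka-1:9092")
    .option("subscribe", "datacenter-metrics")
    .load()
)

# Decode the raw Kafka payload; real root-cause analysis would
# replace the console sink below.
events = metrics.selectExpr("CAST(value AS STRING) AS event")
query = events.writeStream.format("console").start()
query.awaitTermination()
```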